Distribution Free Decomposition of Multivariate Data SPR'98 Invited submission
Authors
Abstract
We present a practical approach to nonparametric cluster analysis of large data sets. The number of clusters and the cluster centers are automatically derived by mode seeking with the mean shift procedure on a reduced set of points randomly selected from the data. The cluster boundaries are delineated using a k-nearest neighbor technique. The proposed algorithm is stable and efficient: a 10,000-point data set is decomposed in only a few seconds. Complex clustering examples and applications are discussed, and convergence of the gradient ascent mean shift procedure is demonstrated for arbitrary distribution and cardinality of the data.
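The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the flat (uniform) kernel, the rule for merging nearby modes, the majority vote, and every name and parameter (`decompose`, `bandwidth`, `n_seeds`, `k`) are assumptions made for illustration only.

```python
import numpy as np

def shift_to_mode(x, sample, bandwidth, max_iter=200, tol=1e-4):
    """Move x uphill by repeatedly replacing it with the mean of nearby sample points."""
    for _ in range(max_iter):
        # Flat kernel (an assumption): use sample points within one bandwidth of x.
        near = sample[np.linalg.norm(sample - x, axis=1) <= bandwidth]
        if len(near) == 0:
            break
        new_x = near.mean(axis=0)
        converged = np.linalg.norm(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x

def decompose(points, bandwidth, n_seeds=100, k=5, seed=0):
    """Return (modes, labels): cluster centers and one cluster label per data point."""
    rng = np.random.default_rng(seed)
    # Reduced set: a random subset of the data, as the abstract describes.
    idx = rng.choice(len(points), size=min(n_seeds, len(points)), replace=False)
    sample = points[idx]

    modes, sample_labels = [], []
    for start in sample:
        x = shift_to_mode(start, sample, bandwidth)
        # Merge rule (an assumption): end points closer than one bandwidth share a mode.
        for j, m in enumerate(modes):
            if np.linalg.norm(x - m) < bandwidth:
                sample_labels.append(j)
                break
        else:
            modes.append(x)
            sample_labels.append(len(modes) - 1)
    sample_labels = np.array(sample_labels)

    # k-nearest-neighbor step: each data point takes the majority label
    # of its k nearest points in the labeled reduced set.
    dists = np.linalg.norm(points[:, None, :] - sample[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    labels = np.array([np.bincount(v).argmax() for v in sample_labels[nearest]])
    return np.array(modes), labels

# Usage on two synthetic, well-separated Gaussian blobs.
data_rng = np.random.default_rng(1)
data = np.vstack([data_rng.normal(0.0, 0.5, (200, 2)),
                  data_rng.normal(10.0, 0.5, (200, 2))])
modes, labels = decompose(data, bandwidth=2.0)
```

In this sketch the number of clusters is not supplied by the caller. It falls out of how many distinct end points the mean shift iterations reach, which reflects the "automatically derived" claim in the abstract. The bandwidth remains a user choice here.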
Similar resources
Regionalization of landscape pattern indices using multivariate cluster analysis.
Regionalization, or the grouping of objects in space, is a useful tool for organizing, visualizing, and synthesizing the information contained in multivariate spatial data. Landscape pattern indices can be used to quantify the spatial pattern (composition and configuration) of land cover features. Observable patterns can be linked to underlying processes affecting the generation of landscape pa...
Distribution-free tests for polynomial regression based on simplicial depth
A general approach for developing distribution free tests for general linear models based on simplicial depth is presented. In most relevant cases, the test statistic is a degenerated U-statistic so that the spectral decomposition of the conditional expectation of the kernel function is needed to derive the asymptotic distribution. A general formula for this conditional expectation is derived. ...
Parallel Incomplete Cholesky Preconditioners Based on the Non-Overlapping Data Distribution
The paper analyses various parallel incomplete factorizations based on the non-overlapping domain decomposition. The general framework is applied to the investigation of the preconditioning step in cg-like methods. Under certain conditions imposed on the finite element mesh, all matrix and vector types given by the special data distribution can be used in the matrix-by-vector multiplications. Not...
Mann-Whitney multivariate nonparametric control chart.
In many quality control applications, the necessary distributional assumptions to correctly apply the traditional parametric control charts are either not met or there is simply not enough information or evidence to verify the assumptions. It is well known that performance of many parametric control charts can be seriously degraded in situations like this. Thus, control charts that do not requi...
On the non-parametric multivariate control charts in fuzzy environment
Multivariate control charts are generally used in situations where the simultaneous monitoring or control of two or more related quality characteristics is necessary. In most real-world processes, the distributions of the process characteristics are unknown or at least non-normal, so non-parametric or distribution-free charts are desirable. Most non-parametric statistical process-control t...